Verma, Vikas
- An Automated Error Detection System for Indian Language Using Statistical Approach
Authors
Affiliations
1 Assistant Professor, Department of Computer Science and Applications, Maharishi Markandeshvar Engineering College, Mullana, Ambala, IN
2 Research Scholar, Department of Computer Science and Applications, DAV University, Jalandhar, IN
3 Associate Professor, Department of Computer Science and Applications, DAV University, Jalandhar, IN
Source
Research Cell: An International Journal of Engineering Sciences, Vol 35, No SP (2023), Pagination: 146-152
Abstract
A grammatical error detection system, also called a grammar checker or syntactic analyzer, is an advanced tool for natural language processing. It plays an important role in proofreading and in the development of many other natural language processing applications, such as machine translation, summarization, and question answering. In this research article, we propose a framework for detecting grammatical errors using a statistical approach, specifically an N-gram approach. The corpus used to generate the n-grams is taken from the Indian Languages Corpora Initiative. This corpus is annotated using a morphological analyzer followed by a part-of-speech tagger. Bigrams, trigrams, and quadgrams of part-of-speech tags are generated from the annotated corpus. When the proposed algorithm was tested on self-generated test data for the Punjabi language, we obtained an overall accuracy of 100 percent, a recall of 87.2 percent, and an F-measure of 93.16 percent.
Keywords
Error Detection System, NLP, N-Gram, Syntactic Analyzer, Morphological Analyzer, POS Tagger.
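The N-gram detection idea described in the abstract above can be illustrated with a minimal Python sketch. It assumes a toy tag set and a tiny hand-made corpus of POS tag sequences (not the Indian Languages Corpora Initiative data), and it simply flags any bigram, trigram, or quadgram of tags that never occurs in training. The function names `train_ngram_model`, `flag_unseen`, and `f_measure` are hypothetical, and the article's actual algorithm may differ. The `f_measure` helper is the standard harmonic mean of precision and recall.

```python
from collections import Counter


def ngrams(tags, n):
    """Return all contiguous n-grams of a tag sequence as tuples."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def train_ngram_model(tag_sequences, orders=(2, 3, 4)):
    """Count tag n-grams of each order over POS-tagged training sentences."""
    model = {n: Counter() for n in orders}
    for tags in tag_sequences:
        for n in orders:
            model[n].update(ngrams(tags, n))
    return model


def flag_unseen(tags, model):
    """Return (order, start_index, ngram) for each n-gram never seen in training."""
    return [
        (n, i, gram)
        for n, counts in sorted(model.items())
        for i, gram in enumerate(ngrams(tags, n))
        if counts[gram] == 0
    ]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# Toy training data: illustrative tag sequences only, not a real tag set.
TRAIN = [
    ["PRON", "NOUN", "PSP", "VERB"],
    ["NOUN", "PSP", "ADJ", "NOUN", "VERB"],
    ["PRON", "ADJ", "NOUN", "VERB"],
]
model = train_ngram_model(TRAIN)

# A tag sequence seen in training produces no flags.
print(flag_unseen(["PRON", "NOUN", "PSP", "VERB"], model))
# A scrambled sequence produces flags at the positions of unseen n-grams.
print(flag_unseen(["PSP", "VERB", "NOUN", "PRON"], model))
```

With the figures from the abstract, `f_measure(100.0, 87.2)` evaluates to about 93.16, which matches the reported F-measure if the 100 percent accuracy is read as precision.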
- Part of Speech Tagger for Low Resource Indian Language Using Machine Learning Approach
Authors
Vikas Verma 1, S.K. Sharma 2
Affiliations
1 Research Scholar, Department of Computer Science and Applications, DAV University, Jalandhar, IN
2 Associate Professor, Department of Computer Science and Applications, DAV University, Jalandhar, IN
Source
Research Cell: An International Journal of Engineering Sciences, Vol 35, No SP (2023), Pagination: 153-166
Abstract
In natural language processing, a part-of-speech (POS) tagger is a fundamental component that serves as a preprocessor for many other tools. For any language, the POS tagger is typically developed at an early stage, before more advanced tools. Various approaches are used to develop POS taggers. This research article provides a comparative analysis of the Punjabi POS taggers developed by various researchers and proposes an architecture that uses an efficient machine learning technique to improve POS tagging accuracy. Because the researchers used their own test data and not all of the developed POS taggers are available online, it is not feasible to test all of them on a common test data set. The claimed results show that a POS tagger developed using a hybrid approach performs better than rule-based techniques and other statistical techniques such as N-gram, bigram, and HMM.
Keywords
Ambiguity, Part of Speech, POS, Punjabi, Rule Based Approach, Statistical Approach, Machine Learning, NLP.
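To make one of the statistical techniques named in the abstract above concrete, here is a minimal Python sketch of a bigram HMM tagger decoded with the Viterbi algorithm. It is not the architecture proposed in the article, which the abstract does not specify. The toy English sentences, the add-one smoothing, and the names `train_hmm` and `viterbi` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_hmm(tagged_sentences):
    """Collect tag-transition and word-emission counts from (word, tag) pairs."""
    trans = defaultdict(Counter)  # previous tag -> counts of next tag
    emit = defaultdict(Counter)   # tag -> counts of emitted words
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"              # sentence-start pseudo tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab


def viterbi(words, trans, emit, vocab):
    """Return the most probable tag sequence under add-one smoothed estimates."""
    if not words:
        return []
    tags = list(emit)
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unknown words

    def log_t(prev, tag):
        return math.log((trans[prev][tag] + 1) / (sum(trans[prev].values()) + n_tags))

    def log_e(tag, word):
        return math.log((emit[tag][word] + 1) / (sum(emit[tag].values()) + n_words))

    # Each column maps tag -> (best log score ending in tag, backpointer).
    best = [{t: (log_t("<s>", t) + log_e(t, words[0]), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max((best[-1][p][0] + log_t(p, t), p) for p in tags)
            column[t] = (score + log_e(t, word), prev)
        best.append(column)

    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Toy training data (illustrative English, not Punjabi corpus data).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("runs", "VERB")],
]
trans, emit, vocab = train_hmm(TRAIN)
print(viterbi(["a", "dog", "sleeps"], trans, emit, vocab))
```

On this toy data the decoder tags the unseen sentence "a dog sleeps" as DET, NOUN, VERB. A real tagger would need a properly annotated corpus and a more careful treatment of unknown words.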